Towards a formal framework for linguistic annotations
Authors
Abstract
‘Linguistic annotation’ is a term covering any transcription, translation or annotation of textual data or recorded linguistic signals. While there are several ongoing efforts to provide formats and tools for such annotations and to publish annotated linguistic databases, the lack of widely accepted standards is becoming a critical problem. Proposed standards, to the extent they exist, have focussed on file formats. This paper focuses instead on the logical structure of linguistic annotations. We survey a wide variety of annotation formats and demonstrate a common conceptual core. This provides the foundation for an algebraic framework which encompasses the representation, archiving and query of linguistic annotations, while remaining consistent with many alternative file formats.
Similar resources
A formal framework for linguistic annotation
"Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions – audio, video and/or physiological recordings – or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, "named entit...
Towards Adaptation of Linguistic Annotations to Scholarly Annotation Formalisms on the Semantic Web
This paper explores how and why the Linguistic Annotation Framework might be adapted for compatibility with recent more general proposals for the representation of annotations in the Semantic Web, referred to here as the Open Annotation models. We argue that the adapted model, in addition to being interoperable with other annotations and annotation tools, also resolves some representational lim...
Ontology-Based Interface Specifications for a NLP Pipeline Architecture
The high level of heterogeneity between linguistic annotations usually complicates the interoperability of processing modules within an NLP pipeline. In this paper, a framework for the interoperation of NLP components, based on a data-driven architecture, is presented. Here, ontologies of linguistic annotation are employed to provide a conceptual basis for the tag-set neutral processing of ling...
Towards International Standards for Language Resources
This paper describes the Linguistic Annotation Framework (LAF) developed by the International Standards Organization TC37 SC4, which is to serve as a basis for harmonizing existing language resources as well as developing new ones. We then describe the use of the LAF to represent the American National Corpus and its linguistic annotations.
The Effects of Multimedia Annotations on Iranian EFL Learners’ L2 Vocabulary Learning
In our modern technological world, Computer-Assisted Language Learning (CALL) is a new approach to learning a language in general, and to learning L2 vocabulary in particular. It is assumed that the use of multimedia annotations promotes language learners’ vocabulary acquisition. Therefore, this study set out to investigate the effects of different multimedia annotations (still picture annotatio...
Publication date: 1998